Progressive block-based coding for image compression

ABSTRACT

A method of image compression includes significance switching of DCT coefficients in block-based embedded DCT procedures. Bitwise digitized DCT coefficients are passed through successive significance sweeps of the whole image from the most significant down to the least significant coefficient bit planes. With each new sweep, newly significant coefficients may appear within a block, and block-masking is used to transmit the addresses of those newly significant coefficients. An off-mask may also be used. The invention further relates to a hardware or software-based image encoder.

This Application is a continuation of International Application No. PCT/GB98/00360, filed Feb. 5, 1998, now pending (which is hereby incorporated by reference).

TECHNICAL FIELD

The present invention relates to image compression and particularly, although not exclusively, to a progressive block-based embedded DCT coder, and to a method of encoding.

BACKGROUND OF THE INVENTION

The JPEG baseline method for still image coding uses the Discrete Cosine Transform (DCT) in a fixed 8×8 pixel partition. Through a linear quantization table and zig-zag scanning of DCT coefficients, the redundancy and bandwidth characteristics of the DCT are exploited over a range of compressions. Recently, however, it has become clear that the JPEG coder is not particularly efficient at higher compression ratios, and other methods such as wavelets have produced better results while having the advantage of being fully embedded. Some researchers have also attempted to combine DCT with zerotree quantization, usually associated with wavelet transforms: see Xiong, Guleryuz and Orchard, ‘A DCT-Based Image Coder’, IEEE Sig. Proc. Lett., Vol 3, No 11, November 1996, p289.

It is an object of the present invention to advance the field of image compression generally, and in particular to provide an improved method of image compression which is capable of use with well understood transforms such as the DCT.

SUMMARY OF THE INVENTION

According to a first aspect of the present invention there is provided a method of image compression comprising:

-   (a) dividing an image to be compressed into a plurality of image blocks;
-   (b) carrying out a two-dimensional block transform on each block to produce a corresponding plurality of coefficient blocks;
-   (c) bitwise digitizing the coefficients within each coefficient block to define a plurality of bit planes for each coefficient block;
-   (d) defining a group of one or more consecutive bit planes starting with the most significant bit plane;
-   (e) selecting those coefficients which first become significant within the group;
-   (f) flagging the said selected coefficients;
-   (g) transmitting information representative of the positions of the said selected coefficients; and transmitting the bits within the group of the said coefficients; and
-   (h) repeating (d) to (g) one or more times, with each new group starting with the most significant bit plane not previously dealt with; and, at each repeated pass, also transmitting the bits within the current group of those coefficients which were previously flagged on an earlier pass.

Such a method could also be applied using the one-dimensional DCT to audio recording.

According to a second aspect of the invention there is provided a coder for encoding images, comprising:

-   (a) means for dividing an image to be compressed into a plurality of image blocks;
-   (b) means for carrying out a two-dimensional block transform on each block to produce a corresponding plurality of coefficient blocks;
-   (c) means for bitwise digitizing the coefficients within each coefficient block to define a plurality of bit planes for each coefficient block;
-   (d) means for defining a group of one or more consecutive bit planes starting with the most significant bit plane;
-   (e) means for selecting those coefficients which first become significant within the group;
-   (f) means for flagging the said selected coefficients;
-   (g) means for transmitting information representative of the positions of the said selected coefficients, and for transmitting the bits within the group of the said coefficients; and
-   (h) means for repeating (d) to (g) one or more times, with each new group starting with the most significant bit plane not previously dealt with; and means for transmitting, at each repeated pass, the bits within the current group of those coefficients which were previously flagged on an earlier pass.

Preferably, the encoder provides for significance switching of DCT coefficients in block-based embedded DCT image compression. The encoder provides output on one or more data streams that may be terminated within a few bits of any point.

The invention also extends to a video coder/decoder including a coder as claimed in claim 17 and an associated decoder, the decoder being arranged to maintain a running record, as transmission between the coder and the decoder proceeds, of the coefficients which are currently significant.

The preferred two-dimensional block transform of the present invention is the Discrete Cosine Transform, although other transforms such as the Fast Fourier Transform (FFT) or the Lapped Orthogonal Transform could be used.

It will be appreciated that in the method of the present invention the order of transmission is not significant. It is therefore to be understood, except where logic requires, that the various parts of the method are not necessarily carried out sequentially in the order specified in section (d) of claim 1. For example, the bits from the newly selected coefficients could be transmitted either before or after the bits of those coefficients which were previously flagged on an earlier pass. Similarly, the bits of those coefficients which were previously flagged on an earlier pass could be transmitted either before or after the new coefficients are selected for flagging.

The bit planes are swept consecutively from the most significant bit plane to the least significant bit plane. This may either be repeated separately for each image block, or alternatively all blocks may be dealt with at the first bit plane, then all blocks dealt with at the second bit plane, and so on.

The philosophy of significance switching, as used in the present invention, is that the overheads introduced will be compensated for by the savings in not transmitting bits for small coefficients until they are switched on. Good performance might naturally be expected at high compression ratios, but what is surprising is the excellent performance both for lossless compression and for compression at low ratios. The coder of the present invention is preferably embedded, in other words the bit stream can be stopped within a few bits of any point while still guaranteeing the least possible distortion overall. When used with an appropriate decoder, either coder or decoder can terminate the bit stream as needed, dependent upon the available bandwidth or the available bit budget.

The coder of the present invention has been found to out-perform the base-line JPEG method in peak signal to noise ratio (PSNR) at any compression ratio, and is similar to state-of-the-art wavelet coders. The results show that block transform (for example DCT) coding is competitive across the whole range of compression ratios, including lossless, so that a significance-switched block coder would be capable of meeting the requirements of future image compression standards in an evolutionary manner.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention may be carried into practice in a number of ways and one specific embodiment will now be described, by way of example, with reference to the figures in which:

FIG. 1 illustrates the DCT transformation and digitization of image blocks;

FIG. 2 illustrates the significance sweeps made through the individual bit planes in coefficient space; and

FIG. 3 illustrates the zig-zag ordering of coefficients, and use of the binary mask.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

In the preferred method of the present invention, the image to be encoded is first partitioned into a plurality of square image blocks. The partitioning may either be by way of a regular tiling, for example of 8×8 pixel blocks, or alternatively some more complex tiling using blocks of differing sizes. One convenient method of tiling is to vary the block size across the image according to the power in the image (measured by the sum of the squares of pixel intensities). As shown in FIG. 1, each image block is then transformed using the 2D spatial DCT (Discrete Cosine Transform) to produce a corresponding DCT coefficient block of the same size as the original image block. Each of the coefficients within the coefficient block is then digitized and bitwise encoded as shown in FIG. 1. Each coefficient 10 within the coefficient block 20 is therefore represented as a series of individual bits 30, ranging from the most significant bit (MSB) to the least significant bit (LSB). The DCT transformation and digitization preferably follow the scaling used within the JPEG standard: see Wallace, G. K., “The JPEG Still Picture Compression Standard”, Comm. ACM, Vol. 34, No. 4, pp. 30-44, 1991. This scaling represents a compromise which accommodates the full range of possible DCT coefficients with reasonably safe integer roundings.

The encoded integer coefficients are now required to be manipulated in a progressive fashion, and transmitted within a data-stream that can be rapidly terminated at any point. In order to achieve this, the integer coefficients are first rearranged into an ordered array using the zig-zag sequence of the JPEG standard. FIG. 3 shows a typical zig-zag sequence for an 8×8 DCT coefficient block. Each coefficient thus has an associated linear zig-zag address which may be used to identify it. The reordering places the coefficients in ascending order of Manhattan distance from the DC term, that is from the term in the top left hand corner of the block which is representative of zero frequency in both the x and the y directions.
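The zig-zag addressing described above can be illustrated with a minimal Python sketch. The function name `zigzag_order` is hypothetical, and the traversal direction on each diagonal is assumed to follow the usual JPEG convention; FIG. 3 may differ in detail.

```python
def zigzag_order(n):
    """Return the (row, col) positions of an n x n block in zig-zag order.

    The list index is the linear zig-zag address; row + col is the
    Manhattan distance of that position from the DC term at (0, 0).
    """
    order = []
    for d in range(2 * n - 1):  # d = row + col, one anti-diagonal at a time
        rows = range(max(0, d - n + 1), min(d, n - 1) + 1)
        # Assumed JPEG convention: even diagonals are walked with the row
        # index decreasing, odd diagonals with the row index increasing.
        if d % 2 == 0:
            rows = reversed(rows)
        order.extend((r, d - r) for r in rows)
    return order


if __name__ == "__main__":
    order = zigzag_order(8)
    for address in (0, 1, 2, 3, 17):
        r, c = order[address]
        print(address, (r, c), "Manhattan distance", r + c)
```

Because each anti-diagonal is emitted in full before the next begins, the addresses are automatically in ascending order of Manhattan distance from the DC term.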

Turning now to FIG. 2, there are shown three separate coefficient blocks, numbered 1 to 3. Each block is represented by a plurality of bit planes 1 to 4, with plane 1 containing the most significant bits for each of the digitized coefficients, and plane No. 4 the corresponding least significant bits. It will be understood of course, that in a typical implementation there will normally be many more than three coefficient blocks, and many more than four bit planes: in the preferred embodiment, in fact, there may be twelve bit planes, with each bit plane having 32×32 bits. Other sizes could also be used, for example 8×8 or 16×16.

In order to determine the order in which the individual bits will be transmitted, the following algorithm is used:

Sweep DCT bit planes from MSB to LSB
    Sweep all image blocks
        Switch on significant coefficients
        Send data bits for significant coefficients
    Next block
Next bit plane

This algorithm may perhaps best be described by way of example, with reference to FIG. 2; in this example, it will be assumed for the sake of simplicity that all the values are positive integers. First, a sweep is made across bit plane No. 1 of each of the three blocks 1 to 3, to determine which are the significant coefficients. These will be the largest coefficients, and as the sweep is initially being made across the most significant bits, they may be determined by selecting those entries in which there is a one rather than a zero in bit plane 1. In FIG. 2, those are the coefficients labelled A, B, C and D; coefficients E and F are not yet significant, since their bits in bit plane 1 are zero.

Once the significant coefficients have been selected, all of the corresponding bits in the lower bit planes 2 to 4 are automatically switched on, as indicated by the filled in squares representing the bits of coefficients A to D. The selected bits within bit plane 1 are then transmitted.

Next, a second sweep is made over the whole image across bit plane 2. Bits from coefficients which have already been switched on in the previous sweep are automatically transmitted, so in this example bits 1001 are transmitted, these representing the second most significant bits for the coefficients A to D. Any coefficients which newly become significant at this level are switched on, as illustrated by the crossed squares representing the second, third and fourth bits for coefficient E. The bits for all such newly significant coefficients on bit plane 2 are also transmitted.

Another sweep is then carried out on the third bit plane. Once again, all the bits representative of coefficients which have already been switched on are automatically sent: in this case, these are the third bits of coefficients A, B, C, D and E. At this level, coefficient F newly becomes significant, and accordingly the one representative of bit 3 of that coefficient is also sent. At the same time, that coefficient is switched on, as indicated by the half-shaded squares representative of the third and fourth bits of that coefficient.

Finally, a sweep is performed across the fourth bit planes. In this example, all of the illustrated coefficients have previously been switched on at a higher level, and hence all of the resultant bits 111000 are transmitted.

The process continues for as many bit planes as were initially required to digitize and bitwise encode each individual coefficient, although the very last bit plane may need to be dealt with as a special case, to be discussed below. The process is progressive, in the sense that the most important information is sent first, so that the transmission may be stopped part-way through if transmission time is limited and/or limited bandwidth is available.
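The sweep just described can be sketched in Python under the same simplifying assumption that all coefficients are non-negative integers. The names `significance_sweeps` and `decode`, and the tuple layout of the stream, are illustrative choices rather than anything fixed by the description. Addresses are passed as plain lists instead of coded masks, and no entropy coding or separate DC handling is attempted.

```python
def significance_sweeps(blocks, n_planes):
    """Sweep bit planes MSB-first over all blocks (plane-major order).

    blocks: list of lists of non-negative integer coefficients.
    Returns a list of (plane, block_index, newly_on, refinement_bits), where
    newly_on holds addresses switched on at this plane and refinement_bits
    holds this plane's bit for each coefficient switched on earlier.
    """
    switched_on = [set() for _ in blocks]
    stream = []
    for plane in range(n_planes - 1, -1, -1):
        for b, coeffs in enumerate(blocks):
            already = sorted(switched_on[b])
            bits = [(coeffs[i] >> plane) & 1 for i in already]
            # With positive values only, a coefficient becomes significant
            # at the first plane where its bit is 1, so that bit is implied.
            newly_on = [i for i, c in enumerate(coeffs)
                        if i not in switched_on[b] and (c >> plane) & 1]
            switched_on[b].update(newly_on)
            stream.append((plane, b, newly_on, bits))
    return stream


def decode(stream, block_sizes):
    """Rebuild coefficients from any prefix of the stream."""
    blocks = [[0] * size for size in block_sizes]
    switched_on = [set() for _ in block_sizes]
    for plane, b, newly_on, bits in stream:
        for i, bit in zip(sorted(switched_on[b]), bits):
            blocks[b][i] |= bit << plane
        for i in newly_on:
            blocks[b][i] |= 1 << plane
        switched_on[b].update(newly_on)
    return blocks


if __name__ == "__main__":
    stream = significance_sweeps([[9, 2], [5]], n_planes=4)
    print(decode(stream, [2, 1]))      # full stream
    print(decode(stream[:4], [2, 1]))  # stopped after two bit planes
```

Decoding a truncated prefix yields a coarser approximation of every block, which is the progressive behaviour the text describes.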

It has been found in practice to be more efficient to exclude the DC component of each block from the above scheme, and to send that separately. Accordingly, in the preferred embodiment switching is applied only to the AC terms of the coefficient block. Rather than sweeping across each of the coefficient blocks for a particular bit plane, it would in an alternative embodiment be possible to sweep through all the bit planes within one block before proceeding to the next block. The corresponding algorithm for this would be:

Sweep all image blocks
    Sweep DCT bit planes from MSB to LSB
        Switch on significant coefficients
        Send data bits for significant coefficients
    Next bit plane
Next block

During each significance sweep, the significant bits within each bit plane may be sent in any convenient order. For example, within each bit plane, the first bits to be sent may be those for each of the coefficients which have already been “switched on” (in other words those coefficients which are already significant); the addresses of any newly-significant coefficients are then transmitted, to switch them on, followed by a stop symbol. Finally, the next bit is sent for all of the newly-significant coefficients. Alternatively, the switches may be sent first followed by all the data: this has the advantage of improving the run length coding of the data.

For improved efficiency at higher compressions, the DC coefficient within each block is preferably sent separately, prior to the significance sweeps of the AC terms.

It should be understood that the system needs to transmit addressing information representative of the positions of those coefficients which are significant. While this could be done simply by transmitting a list of addresses sent in zig-zag order, the applicants have determined that the sending of a binary mask, also in zig-zag order, can further improve efficiency. For example, referring to FIG. 3, let us assume that coefficients 2, 3, 6, 7, 13, 14 and 17 have become newly significant at the bit plane level of the current sweep. Those coefficients then have to be “turned on”, so that all the bits corresponding to those coefficients in the lower bit planes will in future automatically be transmitted. That requires the transmission of the addresses of those coefficients.

The applicant has realized that the addresses may be transmitted not directly but by way of a bit mask, in zig-zag order. Thus, in order to switch on the mentioned coefficients (assuming none have previously been switched on), the mask 001100110000011001 may be sent. A ‘1’ in the mask indicates that that particular coefficient has newly become significant. This may be run length coded as 2020502 STOP.
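The mask and its run length code from this example can be reproduced with a short Python sketch. The helper names `build_mask` and `run_lengths` are hypothetical, and zig-zag addresses are assumed to start at 0, which is what the quoted mask implies.

```python
def build_mask(newly_significant):
    """Binary mask in zig-zag order, ending at the highest newly-on address."""
    on = set(newly_significant)
    return "".join("1" if i in on else "0"
                   for i in range(max(on) + 1))


def run_lengths(mask):
    """Number of '0' symbols preceding each '1' in the mask."""
    runs, zeros = [], 0
    for symbol in mask:
        if symbol == "1":
            runs.append(zeros)
            zeros = 0
        else:
            zeros += 1
    return runs


if __name__ == "__main__":
    mask = build_mask([2, 3, 6, 7, 13, 14, 17])
    print(mask)
    print("".join(str(r) for r in run_lengths(mask)), "STOP")
```

Each run length counts the zeros since the previous ‘1’, so adjacent newly significant coefficients contribute a run of 0.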

The coefficients within each DCT block may be negative as well as positive, and if appropriate negative values may be suitably bitwise encoded using a 2's complement representation. If the coefficient is positive, the “1” bits are significant; if negative the “0” bits are significant. The first data bit sent when a coefficient becomes significant determines the sign.

The final bit plane may need to be dealt with as a special case. Any coefficient not previously switched on will be either 0 or −1. One way of dealing with the final phase is to send a mask only for the −1 coefficients; there is no need to send any data, as all non-significant coefficients are 0 and all those newly switched are −1.
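A small Python sketch may help to show why only 0 and −1 can remain unswitched at the end. The function name `first_significant_plane` is hypothetical. It relies on Python's arithmetic right shift, which behaves like a 2's complement representation of unlimited width, and it assumes the coefficient fits within the stated number of magnitude bit planes.

```python
def first_significant_plane(value, n_planes):
    """Find where a 2's complement coefficient first becomes significant.

    Planes are numbered n_planes - 1 (most significant) down to 0.
    Returns (plane, first_bit), or None if the value never switches on.
    A first_bit of 1 signals a positive value and 0 a negative value.
    """
    sign_bit = 1 if value < 0 else 0
    for plane in range(n_planes - 1, -1, -1):
        bit = (value >> plane) & 1
        if bit != sign_bit:  # '1' is significant if positive, '0' if negative
            return plane, bit
    return None


if __name__ == "__main__":
    for v in (5, -5, 0, -1):
        print(v, first_significant_plane(v, n_planes=4))
```

A value of 0 has only “0” bits and a value of −1 has only “1” bits, so neither ever differs from its own sign, and both fall through to the special case described above.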

It should be recalled that only new coefficients have to be masked at each pass, since once a coefficient has been switched on it remains switched on until the end of the procedure. The masking method is efficient, with typically fewer bits needed to transmit the switching information than a direct list of coefficient addresses.

Various methods of packaging the mask prior to transmission may be used. The mask may be sent sequentially, followed by a special stop symbol to indicate its end. In an alternative scheme, a length symbol representative of the zig-zag length may be sent before the mask to obviate the need for the special stop symbol. Finally, the mask could be preceded by the Manhattan depth of its highest order coefficient; for example in FIG. 3, the highest order coefficient has zig-zag address 17 and a Manhattan depth of 5. Using Manhattan depth will result in three extra mask bits being set (with the mask being assumed to terminate at the end of the zig-zag line that includes the given Manhattan depth). However, the Manhattan depth requires two fewer bits at all block sizes than would a zig-zag length. Recalling that the DCT packs most of the energy into a small subset of coefficients, leaving a large number of smaller coefficients, it will be understood that the Manhattan depth is more efficient for small values.
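The Manhattan-depth packaging can be checked numerically with the following Python sketch. The helper names are hypothetical, and zig-zag addresses are assumed to start at 0 as in the mask example.

```python
def diagonal_length(d, n):
    """Number of coefficients at Manhattan distance d in an n x n block."""
    return min(d, n - 1) - max(0, d - n + 1) + 1


def manhattan_depth(address, n):
    """Manhattan distance from the DC term of a given zig-zag address."""
    d = 0
    while address >= diagonal_length(d, n):
        address -= diagonal_length(d, n)
        d += 1
    return d


def mask_length_for_depth(depth, n):
    """Mask length when the mask runs to the end of the given zig-zag line."""
    return sum(diagonal_length(d, n) for d in range(depth + 1))


if __name__ == "__main__":
    depth = manhattan_depth(17, n=8)
    padded = mask_length_for_depth(depth, n=8)
    print("depth", depth, "mask length", padded, "extra bits", padded - 18)
```

For an 8×8 block, an 18-symbol mask ending at address 17 is extended to the end of the depth-5 zig-zag line at address 20, which accounts for the three extra mask bits. In that 8×8 case the 64 possible zig-zag lengths need 6 bits, whereas the 15 possible depths need only 4.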

In each case, the switching mask will contain mostly “Off” symbols (zeros), which can be efficiently compressed using an arithmetic coder. It is particularly useful to encode the mask data using first order predictive adaptive arithmetic coding, since much of the data comprises long runs of low entropy off symbols. Other methods such as Huffman could also be used to compact the mask data prior to transmission. Run length coding may also be used.

Depending upon the method chosen to package the mask, the output from the coder will take the form of one or more bit streams. In the first arrangement mentioned above, where the mask is terminated by a special stop symbol, the mask and symbol may be sent as one stream with the DCT data being sent as another stream. In the alternatives, where a separate length symbol or Manhattan depth symbol is used, it may be convenient to output three separate streams: one specifying the Manhattan depth, one for the mask data, and one for the DCT data.

Where several separate data streams are used, the synchronization may be controlled so that each stream is maintained within a few bytes of synchrony with the others. This allows a decoder to interrupt the transmission at any time.

At the far end of the transmission stream, a decoder maintains a record of the mask for each image block, giving the current status of each of its DCT coefficients. The mask is updated at each significance pass.

In a variant of the method, an “off” mask may be maintained and sent as well as the “on” mask discussed above. This allows the coder to avoid the sending of bits which are so far down the bit planes as to represent noise rather than real data. In practice, once a coefficient has been significant for several bit planes it may well be sufficiently well defined for visual purposes, and it could then be turned off.

In a further more general variant of the method, the significance testing need not be carried out for each consecutive bit plane. In many circumstances, it may be more efficient, and may give an acceptable result, to mask several planes at once. This reduces the overhead of masking each bit plane individually.

It will be understood that masking each bit plane separately is mathematically equivalent to making comparisons against a decreasing threshold value which goes as 2^(n). Other threshold values could be used instead, providing for either wider or narrower significance steps. In one embodiment, the bit planes are divided up into groups, with each group being masked separately. Depending on the application, each group might consist of the same number or alternatively of a different number of bit planes. Some of the groups might comprise only a single bit plane, while others are made up of several.
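The equivalence between plane-by-plane masking and a decreasing threshold can be sketched in Python as follows. The function names and the `group_sizes` parameter are illustrative assumptions, and only non-negative coefficient magnitudes are considered.

```python
def group_thresholds(n_planes, group_sizes):
    """Significance threshold for each group of consecutive bit planes.

    group_sizes lists how many planes each group spans, MSB group first.
    A magnitude is significant within a group if it is at least 2**p,
    where p is the lowest plane index in that group.
    """
    if sum(group_sizes) != n_planes:
        raise ValueError("group sizes must cover every bit plane exactly once")
    thresholds, lowest = [], n_planes
    for size in group_sizes:
        lowest -= size
        thresholds.append(1 << lowest)
    return thresholds


def first_significant_group(magnitude, thresholds):
    """Index of the first group in which the magnitude is significant."""
    for group, threshold in enumerate(thresholds):
        if magnitude >= threshold:
            return group
    return None


if __name__ == "__main__":
    per_plane = group_thresholds(4, [1, 1, 1, 1])  # one mask per bit plane
    grouped = group_thresholds(4, [1, 3])          # one mask for the last three
    print(per_plane, first_significant_group(5, per_plane))
    print(grouped, first_significant_group(5, grouped))
```

With one plane per group the thresholds halve at each step, as in the 2^(n) case; wider groups give fewer thresholds and therefore fewer masks to send.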

It is found in practice that the mask switching algorithm described above performs slightly better than JPEG at low compression ratios, and substantially better at high compression ratios. Indeed, using a 16×16 block size, performance over the whole range of compression ratios is very similar to that achieved by the wavelet coder of Said and Pearlman: see SAID, A. and PEARLMAN, W. A.: ‘Image Compression Using the Spatial-Orientation Tree’, IEEE International Symposium on Circuits and Systems, 1993, (694), pp. 279-282. The present invention therefore provides state of the art performance while having the advantage of being usable with DCT, a transformation which is widely used and understood as a result of its adoption as the core technology of JPEG and all current versions of MPEG.

In a typical embodiment of the present invention, a video codec (Coder/Decoder) comprises a hardware or software based coder, and a hardware or software based decoder. Bits are transmitted progressively from the coder to the decoder, with the coder being instructed to keep sending bits until a certain compression target has been reached, or a certain distortion achieved. Using the two or three individual streams of data previously referred to, the decoder can progressively reconstruct the image. As the mask data are received, the decoder updates a record, held in memory, of which coefficients are currently switched on. As data transmission proceeds, more and more coefficients are switched on. If the process is allowed to continue until all of the data has been transmitted, the decoder can reconstruct a lossless image, with the exception of any small rounding errors that may have occurred during the DCT digitization process. A decoder in a multi-media system which uses a progressive, embedded coder as described above can begin to reveal an image as soon as transmission commences. This is an advantage often claimed for wavelets, but significance-switched block transforms also have this capability.

1. A method of image compression comprising the steps of: (a) dividingan image to be compressed into a plurality of image blocks; (b) carryingout a two-dimensional block transform on each block to produce acorresponding plurality of coefficient blocks; (c) bitwise digitizingthe coefficients within each coefficient block to define a plurality ofbit planes for each coefficient block; (d) defining a group of one ormore consecutive bit planes starting with the most significant bitplane; (e) selectively flagging those, by a coder device, coefficientswhich first become significant within the a group of one or moreconsecutive bit planes of corresponding coefficient blocks of imageblocks resulting from a block transform, starting with the mostsignificant bit plane; and (f) transmitting, by the coder device,information representative of the positions of the said flaggedcoefficients and transmitting the bits within the a group of the saidflagged coefficients; and, (g) repeating steps (d) to (f) one or moretimes, with each new group starting with the most significant bit planenot previously dealt with and, at each repeated pass, also transmittingthe bits within the current group of those coefficients which werepreviously flagged on an earlier pass.
 2. A method as claimed in claim 1in which step (g) is carried out across the entire image to becompressed further comprising: repeating the selectively flagging andthe transmitting one or more times, each with a new group starting witha most significant bit plane not previously dealt with and, at eachrepeated pass, also transmitting bits within a current group of thosecoefficients which were previously flagged on an earlier pass.
3. A method as claimed in claim 2 in which the repeating is separately performed for each image block.
4. A method as claimed in claim 2 in which the block transform is a two-dimensional Discrete Cosine Transform.
5. A method as claimed in claim 2 in which the block transform is a Lapped Orthogonal Transform.
6. A method as claimed in claim 2 in which the block transform is a Fast Fourier Transform.
7. A method as claimed in claim 2 further including transmitting, by the coder device, mask information representative of a binary mask which defines the positions of the said flagged coefficients.
8. A method as claimed in claim 7 in which the binary mask defines the positions of the said flagged coefficients within each coefficient block in JPEG zig-zag order.
9. A method as claimed in claim 7 in which the binary mask is associated with a mask length code to define the mask end point.
10. A method as claimed in claim 7 in which the binary mask is associated with a stop-code to define a mask end point.
11. A method as claimed in claim 7 in which transmitted mask information is an entropy-coded version of the mask.
12. A method as claimed in claim 11 in which the transmitted mask information is an arithmetic coded version of the mask.
13. A method as claimed in claim 11 in which the transmitted mask information is a Huffman coded version of the mask.
14. A method as claimed in claim 7 in which the transmitted mask information is run length coded.
15. A method as claimed in claim 2 in which a binary mask defines the positions of the said flagged coefficients within each coefficient block in JPEG zig-zag order, the binary mask is associated with a mask length code to define a mask end point, and the mask length code defines a mask end point zig-zag address.
16. A method as claimed in claim 2 in which a binary mask defines the positions of the said flagged coefficients within each coefficient block in JPEG zig-zag order, the binary mask is associated with a mask length code to define a mask end point, and the mask length code defines a Manhattan distance from a DC term to the mask end point.
17. A method as claimed in claim 2 further including the step of transmitting information representative of a binary off-mask for defining the positions of coefficients whose bits are no longer required to be sent.
 18. A coder for encoding imagesdevice, comprising the steps of: (a) means for dividing an image to becompressed into a plurality of image blocks; (b) means for carrying outa two-dimensional block transform on each block to produce acorresponding plurality of coefficient blocks; (c) means for bitwisedigitizing the coefficients within each coefficient block to define aplurality of bit planes for each coefficient block; (d) means fordefining a group of one or more consecutive bit planes starting with themost significant bit plane; (e) means fora mechanism configured toselectively flagging thoseflag coefficients which first becomesignificant within the a group of one or more consecutive bit planes ofcorresponding coefficient blocks of image blocks resulting from a blocktransform, starting with the most significant bit plane; and (f) meansfora mechanism configured to transmittingtransmit informationrepresentative of the positions of the said flagged coefficients andtransmitting the bits within the a group of the said flaggedcoefficients; and, (g) means for repeating steps (d) to (f) one or moretimes, with each new group starting with the most significant bit planenot previously dealt with, and means for transmitting, at each repeatedpass, the bits within the current group of those coefficients which werepreviously flagged on an earlier pass.
19. A coder device as claimed in claim 18 in which the mechanism configured to transmit comprises a binary mask mechanism.
20. A coder device as claimed in claim 19 including a transmitting mechanism configured to transmit, as synchronized data streams, the coefficient bits and mask information.
 21. A videocoder/decoder comprising: a coder and an associated decoder, wherein (1)the coder encoding images and comprising the steps of comprises: (a)means for dividinga mechanism configured to divide an image to becompressed into a plurality of image blocks; (b) means fora mechanismconfigured to carryingcarry out a two-dimensional block transform oneach block to produce a corresponding plurality of coefficient blocks;(c) means fora mechanism configured to bitwise digitizing thedigitizecoefficients withwithin each coefficient block to define a plurality ofbit planes for each coefficient block; (d) means for defininga mechanismconfigured to define a group of one or more consecutive bit planesstarting with thea most significant bit plane; (e) means fora mechanismconfigured to selectively flaggingflag those coefficients which firstbecome significant within the group; (f) means fora mechanism configuredto transmittingtransmit information representative of the positions ofthe said flagged coefficients and for transmitting the bits within the agroup of the said flagged coefficients; and, (g) means fora mechanismconfigured to repeating steps (d) to (f)repeat operation of themechanism configured to define a group and the mechanism configured totransmit information one or more times, with each new group startingwith thea most significant bit plane not previously dealt with, andmeans for transmitting, at each repeated pass, the bits within the acurrent group of those coefficients which were previously flagged on anearlier pass, and (2) the decoder being arranged to maintain a runningrecord, as transmission between the coder and the decoder proceeds, ofthe coefficients which are currently significant.
22. A method as claimed in claim 2 in which the repeating step is carried out across an entire image to be compressed.